| The Problem |
| The Java Solution |
boolean hasMoreElements()and
Object nextElement() .In fact the implementation of the first method is trivial, and the second is only slightly more interesting. Since the standard Java class DataInputStream detects the end of a file using exceptions, we simply return true every time the first method is called:
public boolean hasMoreElements() {
return true;
}
The second method is also quite simple. We attempt to read in a line of text, and catch any exceptions which would indicate reading past the file's end:
public Object nextElement() {
try {
return in.readLine();
} catch (IOException ex) {
throw new NoSuchElementException();
}
}
For simplicity, we convert all I/O errors into the single exception
indicating that no more elements may be read from the enumerator.
The two programs Caps and Adder each use the Lines class to read successive lines. The main routines of the two classes have similar structure. In each, we begin by initializing the enumerator with the standard keyboard input. The main loop of the program consists of reading and manipulating the input. To fetch the input we use the nextElement method, whose result is of type Object and which must be converted to String by the narrowing cast
line = (String)nextElement
Then after reading the input in each iteration, we perform some manipulation to the input, and use it to update some part of the state of the system.
The loops are exited upon reading a blank line. Following each loop,
the methods may each have some shutdown
operations. In other words, each main routine has the rough
form
public static void main(String[] argv) {
Enumeration lines = new Lines(new DataInputStream(System.in));
while (lines.hasMoreElements()) {
String line = (String)lines.nextElement();
if (nonEmpty(line)) {
update(line.manipulate());
} else {
break;
}
}
shutdown();
}
In the main routine of the Caps class, the manipulation is with the toUpperCase method, the state update is by printing (thus changing the state of the global output), and the shutdown operation is empty:
public static void main(String[] argv) {
Enumeration lines = new Lines(new DataInputStream(System.in));
while (lines.hasMoreElements()) {
String line = (String)lines.nextElement();
if (nonEmpty(line)) {
System.out.println(line.toUpperCase());
} else {
break;
}
}
/* No shutdown operation */
}
In the main routine of the Adder class, the manipulation is by converting the string to a number, the state change is addition into a an accumulator variable, and the shutdown operation is to print the final value of the accumulator:
public static void main(String[] argv) {
Enumeration lines = new Lines(new DataInputStream(System.in));
int sum = 0;
while (lines.hasMoreElements()) {
String line = (String)lines.nextElement();
if (nonEmpty(line)) {
sum = sum + Integer.parseInt(line);
} else {
break;
}
}
System.out.println(sum);
}
The coding of this example is somewhat awkward, a point we expand upon in a footnote. The coding style anticipates the discussion below by bringing several issues together in a textually small space. The footnote shows how a less contorted coding style will still allow the same benefits in the transition from Java to Pizza.
| The Pizza Solution |
One possible solution is creation of additional subclasses of the Lines enumerator which override the appropriate methods. While effective, this approach has two drawbacks. First, such extra classes greatly increase code size, much of which is again duplication, even though one of our original criticisms was the size and duplication within the original code. It is likely that the new classes will be used only once each, meaning that the extra code length gives little gain in functionality.
A second drawback which also affects code readability and maintenance is that these new classes are textually separated from the main routines which use them. Understandability is decreased since the reader must flip back and forth between classes to determine the overall effect of a piece of code; maintenance is more difficult since one must consider the effects of changing a much greater amount of code which may or may not be used in yet more places in the program.
A final minor problem with the Java solution is the need for a type cast. Since it would be difficult for the system to decide the subclass of Object to which the result should be expressed, we must make this conversion explicit. Such conversions are standard under the Java type system, but we will see how the proper use of type polymorphism can make such casts unnecessary.
The Pizza solution makes use of the abstract class pizza.util.Enumerator, which extends the standard pizza.util.Enumeration interface. The Pizza version of both classes are parameterized with nextElement()'s result type. Hence, the narrowing casts of the Java solution are no longer necessary. Aside from the type polymorhism, the basic Pizza Enumeration class is essentially identical to the original Java version.
It is in the Enumerator class that we make best use of Pizza's more expressive constructs. In the Java solution, we found it necessary to repeat common control patterns in the two main routines. In Pizza, we can standardize the control pattern, and pass the particular behavior as a parameter. The two main routines which we consider here exhibit three distinct flavors of abstracted behavior: the termination condition for the loop, the conversion from the original input lines and the state manipulation based on the converted input.
All three flavors are similarly derived. The Pizza version of the plain Lines enumerator features additional methods which returns new enumerators computed as a variation on a given enumerator object. For example, the basic Lines class specifies a trivial routine for the hasMoreElements method, but the termination condition in our two examples is different. We can use the takeWhile method to derive a new enumerator whose behavior is just the same as the originally instantiated Lines object, but with a different termination condition.
We specify the termination condition using a first-class
function value. In a language such as Java where function values
are not considered first-class, an object's methods may be accessed in
only one way: by calling them. For example, in both of the main
classes of our example we have a static nonEmpty method which
indicates when we have received all of the input we wish to consider.
In Java, we can only call this method, but in Pizza, we can also pass
the uncalled method as a parameter when we call another method. Thus
where
We implement the second flavor of behavior by another method which
receives a function value. The enumerator map method takes a
function f and returns a new enumerator whose output is
the result of applying f to the original input. That is, if
the originally instantiated enumerator
In the Pizza version of the Caps class, we wish to
capitalize each element of the input. We have a static method
toUpperCase which performs this manipulation, and we pass
this function as the argument of map. So the call
For the Adder class, we take advantage of another aspect
of first-class functions in order to produce an even more concise main
routine. In Caps, we explicitly named the function
which we passed to map. If we were passing, say, a constant
integer or string to a method we might consider it excessive to give
the constant a name, especially if that use of the constant were
independant from all other uses of the same constant in the program.
But since we consider functions to be first-class citizens of our
language, we can use a function value without giving it a
name, just as we can do with any other value.
Function values are created using the fun keyword. The
expression
The final flavor of behavior concerns what we do with the
enumerated values. The map method alters each enumeration,
but does not actually run through the enumerated values. A call to
the function named by map is made only when each element is
requested, but mapping a function onto an enumeration does not by
itself cause the program to loop through the enumerated elements,
disposing of each as appropriate.
In the Caps class, we want to print each capitalized line
to the standard output. Again, we define a static method
println which simply passes its value to the standard system
routine for text output:
The enumerators in the Adder class are processed slightly
differently: for Caps, each line of input affects only the
corresponding line of output. Once we have printed the capitalized
line, we can safely discard it. In Adder, each number
does affect subsequent behavior. The numbers are to be
summed, and only the final output printed out. Intuitively, we want
to take the series of enumerated values
The intuitive description leaves two unanswered questions: should
the addition associate to the right or left, and how should the
replacement behave when there are no numbers in the enumeration.
Addition of course follows the associativity axiom
For the second question, we will expect an initial value to be
passed to the behavior abstractor along with the behavior function.
When the enumeration has no elements, the initial value will be
immediately returned; otherwise we begin with this value when we add
(or otherwise operate) the elements of the operation. So the two
possible reduce methods would transform an enumeration
Students of design patterns will find much of this
discussion familiar. Behavioral patterns such as Iterator, Strategy
and Template Method are concerned with exactly the sort of pattern
discussed in this example: higher-order functions and the separation
of a typical flow of control from the particulars of specific
instances of that flow. By augmenting simple object-orientation with
first-class functions and a richer notion of type polymorphism, we can
more readily identify such patterns, and express them in much more
concise code.
new Lines(new DataInputStream(System.in))
returns an instantiation of the Lines class to read lines of
standard input until an exception is raised,
new Lines(new DataInputStream(System.in))
.takeWhile(nonEmpty)
instantiates a second enumerator based on the first which produces
output only when nonEmpty returns true. In other
words, on the first element where nonEmpty returns
false, the modified hasMoreElements will return
false and further calls to nextElement will raise an
exception - both whether or not there are more lines to read on the
standard input.
new Lines(new DataInputStream(System.in))
.takeWhile(nonEmpty)
returns some value V at a particular call, then the
enumerator
new Lines(new DataInputStream(System.in))
.takeWhile(nonEmpty)
.map(f)
returns f(V) at the corresponding call.
new Lines(new DataInputStream(System.in))
.takeWhile(nonEmpty)
.map(toUpperCase)
returns capitalized versions of successive lines of input.
fun(String s) -> int { return Integer.parseInt(s); }
denotes a function which maps a single parameter s of type
String to the integer value expressed in digits by that
string. So where the enumerator
new Lines(new DataInputStream(System.in))
.takeWhile(nonEmpty)
will give lines of string input until an empty string is read, the
modification
new Lines(new DataInputStream(System.in))
.takeWhile(nonEmpty)
.map(fun(String s) -> int { return Integer.parseInt(s);
})
will convert each string to the appropriate integer before returning
it.
static void println(String s) {
System.out.println(s);
}
We can then pass this routine to the enumerator method
forall. Once again, this method expects a function as its
parameter. The function should accept values of the enumerated type -
here, String - and should not return anything. That is, the
result type of the function should be void. So where the
enumerator
new Lines(new DataInputStream(System.in))
.takeWhile(nonEmpty)
.map(toUpperCase);
returns successive capitalized lines of input, the call
new Lines(new DataInputStream(System.in))
.takeWhile(nonEmpty)
.map(toUpperCase)
.forall(println);
will read each line of input, passing each in turn to the
println function. The type of this latter expression is
void: we have told the compiler when to stop generating
items, how to adjust each one, and what to do with the adjusted
versions, so it can process the enumeration without further
instructions.
V1,
V2,
V3,
... ,
Vn-1,
Vn.
and replace each comma ``,'' with an addition symbol ``+''
V1 +
V2 +
V3 +
... +
Vn-1 +
Vn.
and then print the result.
(x + y) + z = x + (y + z) ,
but associativity does still matter, first since passed functions may
have side effects which are not associative, and second since we want
our abstraction to be more generally applicable. So we answer the
first question by simply providing two such reduce methods,
one for each associativity.
V1,
V2,
V3,
... ,
Vn-1,
Vn.
and default value DEFAULT into either a left-associative
reduction
((( ... (((DEFAULT +
V1) +
V2) +
V3) +
... ) +
Vn-1) +
Vn)
or a right-associative reduction
(V1 +
(V2 +
(V3 +
( ... +
(Vn-1 +
(Vn +
DEFAULT)) ... )))) .
So where the enumeration
new Lines(new DataInputStream(System.in))
.takeWhile(nonEmpty)
.map(fun(String s) -> int { return Integer.parseInt(s);
})
generates integers corresponding to those typed as digits on the
standard input, the call
new Lines(new DataInputStream(System.in))
.takeWhile(nonEmpty)
.map(fun(String s) -> int { return Integer.parseInt(s); })
.reduceLeft(0, fun(int x, int y)->int { return x + y;
}));
will sum the integers, beginning with the initial subtotal 0 (zero),
returning the final result. We want to print this final total out, so
the call
System.out.println(
new Lines(new DataInputStream(System.in))
.takeWhile(nonEmpty)
.map(fun(String s) -> int { return
Integer.parseInt(s); })
.reduceLeft(0, fun(int x, int y) -> int { return x +
y; }));
completes the main routine.
The methods map, takeWhile, forAll and
reduceLeft are known as higher-order functions, since they
are functions which accept (and could conceivably return) other
function values. The higher-order methods we provide with the
Enumerator class are modelled after the corresponding
functions in a list class of a typical functional language.
Enumerator objects are very similar to the lazy list
data structures of nonstrict functional languages such as Haskell.
Data structures in many languages are strict: the entire structure is
immediately computed. On the other hand, both Enumerator
sequences and lazy lists are computed only one node at a time, and
only when each element is explicitly required. A more thorough
discussion of the applications of nonstrict data structures may be
found in most textbooks on functional languages, for example that of
Bird
and Wadler.
Some Final Remarks
Sources for the example programs are found in the files
pizza/src/examples/lineinput.{java,pizza}
of the pizza distribution (version 0.28b or higher).
Sources for the library modules are found in the directory
pizza/src/pizza/util/
Page design & maintenance: Martin Odersky and John Maraist.
Java is a trademark of Sun
Microsystems.